1.  Introduction


In a number of statistical problems, it is desired to know the probability of a union of n events, $P\{\bigcup_{j=1}^{n} A_j\}$, where the $A_j$ are undesirable events such as rejection of the jth null hypothesis when it is true, or the jth confidence interval not covering the true parameter. For many of these situations, it is impossible to calculate $P\{\bigcup_{j=1}^{n} A_j\}$ exactly, either because integrating over n events is numerically infeasible or because the entire union/intersection structure among these events is incompletely known. When this occurs, an attempt is made to be conservative and obtain an upper bound for $P\{\bigcup_{j=1}^{n} A_j\}$.

It may be feasible to integrate over k or fewer events or otherwise determine $P\{A_{j_1} \cup A_{j_2} \cup \cdots \cup A_{j_m}\}$ or $P\{A_{j_1} \cap A_{j_2} \cap \cdots \cap A_{j_m}\}$ for $m \le k$. This information may then be used in some fashion to derive the upper bound for $P\{\bigcup_{j=1}^{n} A_j\}$. Many approaches for doing this have previously appeared in the literature. The earliest was the inclusion-exclusion formulas of Boole (1854) and Bonferroni (1936), stating:


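$$
P\Big\{\bigcup_{j=1}^{n} A_j\Big\} \le \sum_{j=1}^{n} P\{A_j\} - \sum_{j_1<j_2} P\{A_{j_1} \cap A_{j_2}\} + \sum_{j_1<j_2<j_3} P\{A_{j_1} \cap A_{j_2} \cap A_{j_3}\} - + \cdots + \sum_{j_1<j_2<\cdots<j_k} P\{A_{j_1} \cap A_{j_2} \cap \cdots \cap A_{j_k}\} \tag{1}
$$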


where k is an odd positive integer. The most familiar of these, of course, is the Standard Bonferroni Inequality, where k = 1:

$$
P\Big\{\bigcup_{j=1}^{n} A_j\Big\} \le \sum_{j=1}^{n} P\{A_j\}
$$

One problem with this approach is that the number of terms one must calculate to implement this formula is $\sum_{j=1}^{k} \binom{n}{j}$, which becomes excessive as k becomes large. For instance, when n = 10, if k = 1 then 10 terms must be calculated; if k = 5 then 637 terms must be calculated.
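The term counts quoted above can be checked directly. The following is a minimal sketch; the function name `inclusion_exclusion_terms` is a hypothetical label introduced here, not something from the paper.

```python
from math import comb

def inclusion_exclusion_terms(n: int, k: int) -> int:
    """Number of probabilities in the order-k inclusion-exclusion bound:
    one term for every subset of 1, 2, ..., k events out of n."""
    return sum(comb(n, j) for j in range(1, k + 1))

print(inclusion_exclusion_terms(10, 1))  # 10
print(inclusion_exclusion_terms(10, 5))  # 637
```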


Another problem is that the upper bound given by the inclusion-exclusion formula does not necessarily become lower as k becomes larger. For example, consider 10 events with the probability of any single event occurring equal to 0.08, the probability of any two events both occurring equal to 0.04 and the probability of any three events all occurring equal to 0.02; then the inclusion-exclusion upper bound with k = 1 for $P\{\bigcup_{j=1}^{n} A_j\}$ is 0.80, while the inclusion-exclusion k = 3 upper bound for this probability is 1.40. Not only is 1.40 > 0.80, but 1.40 > 1, which is itself an upper bound for any probability.
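The arithmetic behind this example can be reproduced with a short sketch. It assumes, as in the example, that every intersection of m events has the same probability; the function name and argument layout are illustrative choices, not part of the paper.

```python
from math import comb

def inclusion_exclusion_bound(n, probs):
    """Order-k inclusion-exclusion upper bound when any m events intersect
    with the same probability probs[m - 1]; k = len(probs) must be odd."""
    k = len(probs)
    if k % 2 == 0:
        raise ValueError("k must be odd to obtain an upper bound")
    # Alternating sum: +C(n,1)p1 - C(n,2)p2 + C(n,3)p3 - ...
    return sum((-1) ** (m + 1) * comb(n, m) * probs[m - 1]
               for m in range(1, k + 1))

print(round(inclusion_exclusion_bound(10, [0.08]), 2))              # k = 1
print(round(inclusion_exclusion_bound(10, [0.08, 0.04, 0.02]), 2))  # k = 3
```

For k = 3 the three sums are 10(0.08) = 0.80, 45(0.04) = 1.80 and 120(0.02) = 2.40, giving 0.80 - 1.80 + 2.40 = 1.40.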

A different approach was developed by Kounias and Marin (1976) and modified by Tydeman and Mitchell (1981). It formulates the upper bound of $P\{\bigcup_{j=1}^{n} A_j\}$ as a linear program with $2^n$ nonnegative variables and $\sum_{j=1}^{k} \binom{n}{j}$ equality constraints. Using this formulation will produce the lowest possible linear upper bound for a given set of probability information. However, even for moderate values of n and k, this linear program will be too complicated to be conveniently evaluated. In fact, no attempt has been made to use this method with k larger than 2. This approach also requires knowledge of $\sum_{j=1}^{k} \binom{n}{j}$ probabilities which, as stated before, may be too many terms to calculate.

It is, therefore, of interest to find methods incorporating knowledge of k event intersection/union probabilities to produce easily calculable upper bounds for n event union probabilities which are lower than those upper bounds currently used. One such formula has been developed for k = 2 by Hunter (1976). It gives:

$$
P\Big\{\bigcup_{j=1}^{n} A_j\Big\} \le \sum_{j=1}^{n} P\{A_j\} - \sum_{e_{ij} \in T} P\{A_i \cap A_j\}
$$

where n is finite, T is any spanning tree with vertices $A_1, A_2, \ldots, A_n$; and $A_i$ is connected to $A_j$ in T by edge $e_{ij}$. Several articles, including those by Stoline (1983) and Bauer and Hackel (1985), have been written evaluating and implementing this method. Hoppe (1985) and Tomescu (1986) expanded on this procedure to develop lower bounds for probabilities of unions. Tomescu (1986) also developed related inequalities which utilize probabilities involving k > 2 events to give upper bounds. These bounds, however, become very complicated as k becomes large, and like the inclusion-exclusion inequalities, do not necessarily decrease with k.
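Because the tree term is subtracted, Hunter's bound is lowest for the spanning tree whose edge probabilities sum to the largest value. A minimal sketch follows, assuming the single-event probabilities are supplied as a list `p` and the pairwise intersection probabilities as a symmetric matrix `p2`. These names, and the use of Prim's algorithm to grow the tree, are illustrative choices made here and not the paper's own procedure.

```python
def hunter_bound(p, p2):
    """Sum of P(A_i) minus the pairwise intersection probabilities along a
    maximum-weight spanning tree, grown one vertex at a time (Prim)."""
    n = len(p)
    in_tree = [False] * n
    in_tree[0] = True
    best = list(p2[0])   # best[v]: heaviest edge joining v to the current tree
    tree_weight = 0.0
    for _ in range(n - 1):
        # Attach the outside vertex with the heaviest connecting edge.
        j = max((v for v in range(n) if not in_tree[v]), key=lambda v: best[v])
        tree_weight += best[j]
        in_tree[j] = True
        for v in range(n):
            if not in_tree[v] and p2[j][v] > best[v]:
                best[v] = p2[j][v]
    return sum(p) - tree_weight

# Ten events, each with probability 0.08 and pairwise intersections 0.04:
p = [0.08] * 10
p2 = [[0.04] * 10 for _ in range(10)]
print(round(hunter_bound(p, p2), 2))  # 0.8 - 9 * 0.04 = 0.44
```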

In section 2, a general method is developed which expands Hunter's idea to k > 2 and n possibly infinite. For a fixed value of n, the number of probabilities needed to apply this new method is a decreasing function of k. It is also shown that when using this method, k can be increased resulting in at worst no improvement in the upper bound. In section 3, the new algorithm is applied to simultaneous confidence intervals and multiple hypothesis testing involving multivariate normal (and t) distributions.

2.  The  New  Upper  Bounds 

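Consider the following representation of the probability for a union of n events (with n possibly $\infty$).

$$
P\Big\{\bigcup_{j=1}^{n} A_j\Big\} = P\Big\{\bigcup_{j=1}^{k} A_j\Big\} + \sum_{j=k+1}^{n} P\{A_j \cap (A_{j-1} \cup A_{j-2} \cup \cdots \cup A_1)^c\}
$$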

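Define the set $S_j$ to contain k - 1 elements from $\{1, 2, \ldots, j-1\}$ for $j \ge k+1$. Without loss of generality, let these elements be $i_1 < i_2 < \cdots < i_{k-1}$. It is true that:

$$
(A_{j-1} \cup A_{j-2} \cup \cdots \cup A_1) \supseteq (A_{i_1} \cup A_{i_2} \cup \cdots \cup A_{i_{k-1}})
$$

which implies that:

$$
(A_{j-1} \cup A_{j-2} \cup \cdots \cup A_1)^c \subseteq (A_{i_1} \cup A_{i_2} \cup \cdots \cup A_{i_{k-1}})^c
$$

and thus that:

$$
P\{A_j \cap (A_{j-1} \cup A_{j-2} \cup \cdots \cup A_1)^c\} \le P\{A_j \cap (A_{i_1} \cup A_{i_2} \cup \cdots \cup A_{i_{k-1}})^c\}, \qquad i_1 < i_2 < \cdots < i_{k-1}, \quad i_1, \ldots, i_{k-1} \in S_j .
$$

From this it follows: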

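Theorem 1. Subset Complement Addition Upper Bound (SCAUB)

$$
\begin{aligned}
P\Big\{\bigcup_{j=1}^{n} A_j\Big\} &= P\Big\{\bigcup_{j=1}^{k} A_j\Big\} + \sum_{j=k+1}^{n} P\{A_j \cap (A_{j-1} \cup A_{j-2} \cup \cdots \cup A_1)^c\} \\
&\le P\Big\{\bigcup_{j=1}^{k} A_j\Big\} + \sum_{j=k+1}^{n} P\{A_j \cap (A_{i_1} \cup A_{i_2} \cup \cdots \cup A_{i_{k-1}})^c\}, \qquad i_1, \ldots, i_{k-1} \in S_j
\end{aligned} \tag{3}
$$

where k is any positive integer smaller than n and $S_j$ is a set with k - 1 elements from $\{1, 2, \ldots, j-1\}$ for $j \ge k+1$.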

This bound is called a subset complement addition upper bound (SCAUB) since it is created by adding probabilities of intersections of new events with complements of unions of subsets of events that have already been incorporated into the bound. The SCAUB can be shown to be a distribution free analog of Glaz and Johnson's (1984) product type bounds for Multivariate Totally Positive Order Two (MTP2) distributions. See Glaz (1987) and Hoover (1988). To obtain the upper bound of Theorem 1 requires only the calculation of n - k + 1 probabilities, each probability involving k events. For n = 10 and k = 5, this is 6 terms as compared with 637 terms needed to use the inclusion-exclusion upper bound with n = 10 and k = 5.
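Theorem 1 can be illustrated on a small finite sample space, where every probability is a sum over outcomes. The sketch below is only an illustration under stated assumptions: the function name `scaub`, the die example, and the choice $S_j = \{j-1, \ldots, j-k+1\}$ (one admissible choice among many) are introduced here and do not come from the paper.

```python
from fractions import Fraction

def scaub(events, prob, k):
    """Order-k SCAUB with S_j taken as the k - 1 events just before A_j.

    events: list of sets of outcomes, in the order A_1, ..., A_n
    prob:   dict mapping each outcome to its probability
    """
    n = len(events)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < n")

    def P(outcomes):
        return sum(prob[w] for w in outcomes)

    bound = P(set().union(*events[:k]))              # P{A_1 u ... u A_k}
    for j in range(k, n):                            # events[j] is A_{j+1}
        earlier = set().union(*events[j - k + 1:j])  # union over S_j
        bound += P(events[j] - earlier)              # P{A_j n (union)^c}
    return bound

# Hypothetical example: one roll of a fair die.
prob = {w: Fraction(1, 6) for w in range(1, 7)}
events = [{1, 2}, {2, 3}, {1, 3, 4}, {4, 5}]
exact = sum(prob[w] for w in set().union(*events))
for k in (1, 2, 3):
    print(k, scaub(events, prob, k), exact)
```

Each call evaluates one union of k events plus n - k intersection terms, that is, n - k + 1 probabilities. In this toy example the bound is 3/2 for k = 1, 1 for k = 2 and 5/6 for k = 3, the last of which equals the exact union probability.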

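When k is 1, the upper bound of Theorem 1 is $P\{A_1\} + \sum_{j=2}^{n} P\{A_j \cap (\emptyset)^c\} = \sum_{j=1}^{n} P\{A_j\}$, which is the Standard Bonferroni Upper Bound. When k is 2, the upper bound of Theorem 1 becomes:

$$
\begin{aligned}
& P\{A_1 \cup A_2\} + \sum_{j=3}^{n} P\{A_j \cap (A_i)^c\} && \text{where } i \in S_j \text{ and } j \ge 3 \\
&= P\{A_1\} + P\{A_2\} - P\{A_1 \cap A_2\} + \sum_{j=3}^{n} \big[ P\{A_j\} - P\{A_j \cap A_i\} \big] && \text{where } i \in S_j \text{ and } j \ge 3 \\
&= \sum_{j=1}^{n} P\{A_j\} - \sum_{j=2}^{n} P\{A_j \cap A_i\} && \text{where } i \in S_j \text{ for } j \ge 3, \text{ and } 1 \in S_2 \\
&= \sum_{j=1}^{n} P\{A_j\} - \sum_{e_{ij} \in T} P\{A_i \cap A_j\} && \text{where } i < j \text{ and } e_{ij} \in T \text{ iff } i \in S_j
\end{aligned}
$$

which is Hunter's upper bound.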
An upper bound which gives the same value as the specific SCAUB inequality with $S_j$ containing $j-1, j-2, \ldots, j-k+1$ for j > k was mentioned by Worsley (1985). The form of this upper bound is only given for k = 3, but can, with some work, be extended to all k. (Note that there are typographical errors in the above article which make the result difficult to understand.) Besides being more restrictive than the SCAUB order k inequality, Worsley's bound also requires the calculation of $2^k - 1 + (n-k)2^{k-1}$ terms to obtain a k order upper bound on the probability of the union of n events. When n = 10 and k = 5, this is 111 terms compared with only 6 terms needed for the bound of Theorem 1. Finally, Worsley's bound uses probabilities of intersections of various events which, in the normal simultaneous confidence interval problem, are numerically much more difficult to calculate than are the probabilities used by the SCAUB (unions of events and intersections of single events with complements of unions).
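The two term counts compared above follow from the stated formulas. A small sketch, with hypothetical function names chosen here for illustration:

```python
def worsley_terms(n: int, k: int) -> int:
    """Terms in the order-k form attributed to Worsley (1985):
    2^k - 1 + (n - k) * 2^(k - 1)."""
    return 2 ** k - 1 + (n - k) * 2 ** (k - 1)

def scaub_terms(n: int, k: int) -> int:
    """Probabilities needed by Theorem 1: n - k + 1."""
    return n - k + 1

print(worsley_terms(10, 5), scaub_terms(10, 5))  # 111 6
```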

The value of the upper bound from Theorem 1 with $k \ge 2$ will depend on the ordering of events and choice of elements for the $S_j$. When k is two, it is always possible to determine an ordering and choice of elements for the $S_j$ that gives the lowest possible Theorem 1 upper bound by using the Minimal Spanning Tree Theorem of Kruskal (1956) with probabilities of intersections as edge weights (see Hunter (1976)). Unfortunately, no method which will always do this for k > 2 has been discovered. If the events $A_1, A_2, \ldots, A_n$ are exchangeable (exchangeable means that $P\{A_{j_1} \cap A_{j_2} \cap \cdots \cap A_{j_m}\} = c_m$ regardless of choice of events), then $P\{A_j \cap (A_{i_1} \cup A_{i_2} \cup \cdots \cup A_{i_{k-1}})^c\}$ will be constant regardless of events; therefore the ordering of events and choice of elements for $S_j$ will not matter. Also, when the events have a natural ordering $1, 2, \ldots, n$ with the overlap between a fixed event and a preceding event being a monotonically decreasing function of the number of events in the sequence separating them, then using the natural ordering with $S_j = \{j-1, j-2, \ldots, j-k+1\}$ should give a good upper bound. This type of situation will occur in Markovian processes and time series.

One  nice  property  of  the  SCAUB  which  inclusion-exclusion  bounds  do  not  have  is 
that  k  can  be  made  larger  with  the  upper  bound  at  worst  becoming  no  lower  and,  in  many 
cases,  becoming  much  lower.  In  other  words,  as  probabilities  involving  more  events  are 
incorporated  into  deriving  the  bound,  the  bound  becomes  better. 
Theorem 2. Monotonicity of the SCAUB

Let $\xi_1$ be a Theorem 1 upper bound derived using a particular value of k with k < n, ordering of events and choice of sets $S_{k+1}, S_{k+2}, \ldots, S_n$. It is possible to produce a Theorem 1 upper bound $\xi_2$ using the value k + 1, the same ordering of events and the sets $S'_{k+2}, S'_{k+3}, \ldots, S'_n$ (the sets $S'_j$ contain k instead of k - 1 elements) with $S_j \subset S'_j$ for $k+2 \le j \le n$ such that:

$$
P\Big\{\bigcup_{j=1}^{n} A_j\Big\} \le \xi_2 \le \xi_1 \tag{4}
$$


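Proof

The first inequality in (4) follows from the SCAUB. To obtain the second inequality in (4), first note that the following identity holds for the first term in $\xi_2$:

$$
P\Big\{\bigcup_{j=1}^{k+1} A_j\Big\} = P\Big\{\bigcup_{j=1}^{k} A_j\Big\} + P\Big\{A_{k+1} \cap \Big[\bigcup_{j=1}^{k} A_j\Big]^c\Big\}.
$$

Next, define the set $S'_{k+1}$ to be $\{1, 2, \ldots, k\}$ and $j^*$ to be the unique element such that $S_j \cup \{j^*\} = S'_j$ for $j = k+1, k+2, \ldots, n$. Now look at the difference $\xi_1 - \xi_2$:

$$
\begin{aligned}
& \sum_{j=k+1}^{n} P\Big\{A_j \cap \Big[\bigcup_{i \in S_j} A_i\Big]^c\Big\} - \sum_{j=k+1}^{n} P\Big\{A_j \cap \Big[\bigcup_{i \in S'_j} A_i\Big]^c\Big\} \\
&= \sum_{j=k+1}^{n} P\Big\{A_j \cap A_{j^*} \cap \Big[\bigcup_{i \in S_j} A_i\Big]^c\Big\} \\
&\ge 0.
\end{aligned}
$$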
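The step from the difference of the two sums to a single sum can be spelled out term by term. Writing $B_j$ for the union over $S_j$ (a shorthand introduced here, not in the original), each summand splits according to whether $A_{j^*}$ occurs:

```latex
\begin{aligned}
B_j &= \bigcup_{i \in S_j} A_i, \qquad \bigcup_{i \in S'_j} A_i = B_j \cup A_{j^*}, \\
P\{A_j \cap B_j^c\}
  &= P\{A_j \cap A_{j^*} \cap B_j^c\} + P\{A_j \cap A_{j^*}^c \cap B_j^c\} \\
  &= P\{A_j \cap A_{j^*} \cap B_j^c\} + P\{A_j \cap (B_j \cup A_{j^*})^c\}, \\
\text{so}\quad
P\{A_j \cap B_j^c\} - P\{A_j \cap (B_j \cup A_{j^*})^c\}
  &= P\{A_j \cap A_{j^*} \cap B_j^c\} \;\ge\; 0 .
\end{aligned}
```

Summing this identity over $j = k+1, \ldots, n$ gives the displayed difference.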


3.  Application  to  Multivariate  Normal  Probabilities  Within  Rectangles 

The SCAUB may be used to produce upper bounds for the probability that the maximum absolute value from a vector of standardized normal (or t) variables is larger than a given value c when the dependence structure (correlation matrix) of the variables is known. Such bounds are of interest in simultaneous hypothesis testing and simultaneous confidence intervals involving multivariate normal (or t) data. In this case, $(X_1, X_2, \ldots, X_n)$ is a multivariate normal vector with mean zero and some known covariance. Event $A_j$ is that variable $X_j$ does not fall in the interval $(-c\sigma_j, c\sigma_j)$.


Examples of such bounds for c = 1.96 and 2.50, and n = 5, 8 and 10 are given in Table 1. We allow k to be 2, 3 and 4 since there are programs by IMSL (1982) and Schervish (1984) which can integrate multivariate normal probability over rectangles with up to four dimensions. Finally, for simplicity, $\sigma_j$ is assumed to be 1 for all j, $\mathrm{Corr}(X_j, X_i)$ is assumed to be $\rho$ and $\mathrm{Corr}(X_{i_1}, X_{i_2})$ is assumed to be $\rho$ for all $i, i_1, i_2 \in S_j$. We allow $\rho$ to be .3, .5, .7, .9 and .99. As a comparison to these upper bounds, the Standard Bonferroni Upper Bound and the Dunnett (1955) Exact Value under the assumption that $\mathrm{Corr}(X_i, X_j) = \rho$ for all i, j are given. The numerical values in Table 1 were obtained using the IMSL (1982) procedures DCADRE and MDNOR to integrate (with an accuracy of ± 0.000001) the Dunnett (1955) Exact Value Formula for equicorrelated multivariate normal distributions.
Table 1. Upper Bounds for $P\{\max_j |X_j| > c\}$ where $(X_1, X_2, \ldots, X_n) \sim N(0, \Sigma)$

ρ        Standard Bonferroni   SCAUB      SCAUB      SCAUB      Exact Value
         or SCAUB k=1          k=2        k=3        k=4        Equicorrelation

c = 1.96, n = 5
0.3      0.24997               0.23042    0.21846    0.21175    0.20891
0.5      0.24997               0.21297    0.19482    0.18621    0.18285
0.7      0.24997               0.18379    0.16072    0.15150    0.14839
0.9      0.24997               0.13141    0.11023    0.10354    0.10157
0.99     0.24997               0.07631    0.06739    0.06510    0.06449

c = 1.96, n = 8
0.3      0.39996               0.36576    0.34315    0.32472    0.29971
0.5      0.39996               0.33520    0.29890    0.27738    0.25013
0.7      0.39996               0.28414    0.23799    0.21495    0.19087
0.9      0.39996               0.19246    0.15011    0.13339    0.11904
0.99     0.39996               0.09605    0.07820    0.07248    0.06825

c = 1.96, n = 10
0.3      0.49996               0.45598    0.42628    0.40003    0.35122
0.5      0.49996               0.41669    0.36829    0.33816    0.28693
0.7      0.49996               0.35104    0.28951    0.25726    0.2130?
0.9      0.49996               0.23317    0.17669    0.15329    0.12761
0.99     0.49996               0.10920    0.08541    0.07740    0.06998

c = 2.50, n = 5
0.3      0.06210               0.05999    0.05857    0.05773    0.05773
0.5      0.06210               0.05674    0.05375    0.05218    0.05154
0.7      0.06210               0.05021    0.04534    0.04341    0.04224
0.9      0.06210               0.03635    0.03083    0.02897    0.02840
0.99     0.06210               0.02030    0.01771    0.01703    0.01684

c = 2.50, n = 8
0.3      0.09936               0.09567    0.09284    0.09073    0.08712
0.5      0.09936               0.08998    0.08400    0.08008    0.07456
0.7      0.09936               0.07855    0.06882    0.06640    0.05735
0.9      0.09936               0.05430    0.04326    0.03860    0.03439
0.99     0.09936               0.02640    0.02102    0.01934    0.01803

c = 2.50, n = 10
0.3      0.12420               0.11945    0.11568    0.11273    0.10553
0.5      0.12420               0.11214    0.10416    0.09868    0.08805
0.7      0.12420               0.09745    0.08447    0.07838    0.06557
0.9      0.12420               0.06626    0.05155    0.04502    0.03742
0.99     0.12420               0.03014    0.02323    0.02088    0.01858

The calculations for the entry in Table 1 with n = 8, k = 3, c = 1.96 and $\rho$ = 0.9 are now shown in detail. For the above case, $P\{A_1 \cup A_2 \cup A_3\}$ taken to six digits is 0.083644, while $P\{A_j \cap (A_{i_1} \cup A_{i_2})^c\}$ taken to six digits is 0.013293 for all j larger than 3 and $i_1, i_2 \in S_j$. So the SCAUB upper bound is:

$$
\begin{aligned}
P\Big\{\bigcup_{j=1}^{8} A_j\Big\} &\le P\Big\{\bigcup_{j=1}^{3} A_j\Big\} + \sum_{j=4}^{8} P\{A_j \cap (A_{i_1} \cup A_{i_2})^c\}, \qquad i_1, i_2 \in S_j \\
&= 0.083644 + 5(0.013293) \\
&= 0.15011 \ \text{(rounded to five digits)}
\end{aligned}
$$
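The structure of this calculation can be sketched numerically under the additional assumption that all n variables are equicorrelated. Then $X_j = \sqrt{\rho}\,Z_0 + \sqrt{1-\rho}\,Z_j$ with independent standard normal $Z$'s, the $X_j$ are independent given $Z_0$, and every probability reduces to a one-dimensional integral. The function names, the use of Simpson's rule, and the integration limits below are illustrative choices; they stand in for, and are not, the IMSL routines DCADRE and MDNOR used for Table 1.

```python
from math import erf, exp, pi, sqrt

def norm_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def prob_all_inside(m, rho, c, steps=4000, lim=8.0):
    """P{|X_1| < c, ..., |X_m| < c} for equicorrelated standard normals,
    0 < rho < 1, by Simpson's rule over the shared factor Z_0 = z."""
    if m == 0:
        return 1.0
    a, b = sqrt(rho), sqrt(1.0 - rho)
    h = 2.0 * lim / steps            # steps must be even
    total = 0.0
    for i in range(steps + 1):
        z = -lim + i * h
        inside = norm_cdf((c - a * z) / b) - norm_cdf((-c - a * z) / b)
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * exp(-0.5 * z * z) / sqrt(2.0 * pi) * inside ** m
    return total * h / 3.0

def scaub_equicorrelated(n, k, rho, c):
    """Theorem 1 bound: P{union of k events} plus (n - k) equal terms
    P{A_j n (union of k - 1 events)^c} = q_{k-1} - q_k."""
    q_prev = prob_all_inside(k - 1, rho, c)
    q_k = prob_all_inside(k, rho, c)
    return (1.0 - q_k) + (n - k) * (q_prev - q_k)

def exact_equicorrelated(n, rho, c):
    """Exact P{max |X_j| > c} under equicorrelation."""
    return 1.0 - prob_all_inside(n, rho, c)

# Table 1 lists 0.15011 for n = 8, k = 3, c = 1.96, rho = 0.9.
print(round(scaub_equicorrelated(8, 3, 0.9, 1.96), 5))
```

Here `1 - q_k` plays the role of $P\{A_1 \cup \cdots \cup A_k\}$ and `q_prev - q_k` the role of each intersection term, mirroring the two numbers combined in the worked example.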

The bounds in Table 1 do become significantly better as k becomes larger. The improvement is quite dramatic for the higher correlations of 0.9 and 0.99. The biggest improvements occur between k = 1 and k = 2. The improvements become monotonically smaller as k increases, which is to be expected.

If variables are equicorrelated, then Dunnett's (1955) method produces the exact value for the probability of a union. This exact value under the assumption of equicorrelation is given in column 6 of Table 1 and can be compared to the numbers in columns 2, 3, 4 and 5 of the same row, which are Theorem 1 upper bounds to the exact value. Under equicorrelation, the upper bounds are close to the exact values for n and/or $\rho$ small. It seems reasonable to assume that for a given set of variables, even without equicorrelation, the smaller the number of variables and the smaller the absolute values of the correlation coefficients, the closer the SCAUB inequalities will be to the exact values.